Investigating the Structure of Procedural Texts for Answering How-to Questions

Authors

  • Estelle Delpech
  • Patrick Saint-Dizier
Abstract

This paper presents ongoing work dedicated to parsing the textual structure of procedural texts. We propose here a model for the instructional structure and criteria to identify its main components: titles, instructions, warnings and prerequisites. The main aim of this project, besides a contribution to text processing, is to be able to answer procedural questions (How-to? questions), where the answer is a well-formed portion of a text, not a small set of words as for factoid questions.

1. Situation and Aims

The main goal of this work is to be able to answer procedural questions, which are questions whose induced response is typically a fragment, more or less large, of a procedure, i.e., a set of coherent instructions designed to reach a goal. Recent informal observations from queries to Web search engines show that procedural questions are the second largest set of queries after factoid questions (de Rijke, 2005). In this paper, we focus on the analysis of procedural structures in texts (titles, instructions, warnings, prerequisites, etc.). Answering procedural questions thus requires extracting not simply a word in a text fragment, as for factoid questions, but a well-formed text structure which may be quite large. Analysing a procedural text requires a dedicated discourse analysis, e.g. by means of a grammar. Such grammars are not very common yet, due to the complex intertwining of lexical, syntactic, semantic and pragmatic factors they must handle to produce a correct analysis.

Producing responses which are well-formed text portions is not specific to procedural questions. Many other types of questions require texts as responses: why questions, but also evaluative or comparative questions. Moreover, any kind of cooperative answering framework requires the production of informational elements such as explanations, examples or arguments, which are basically textual and strongly organized.
Procedural texts are organized sets of instructions; they may also be sets of advice, as in social behavior texts. In our perspective, procedural texts range from apparently simple cooking recipes to large maintenance manuals. They also include documents as diverse as teaching texts, medical notices, social behavior recommendations, directions for use, assembly notices, do-it-yourself notices, itinerary guides, advice texts, savoir-faire guides, etc. Even if procedural texts adhere more or less to a number of structural criteria, which may depend on the author’s writing abilities and on traditions associated with a given domain, we observed a very large variety of realisations, which makes parsing such texts quite challenging.

Procedural texts explain how to realize a certain goal by means of actions which may be temporally organized. Procedural texts can indeed be a simple, ordered list of instructions to reach a goal, but they can also be less linear, outlining different ways to realize something, with arguments, advice, conditions, hypotheses and preferences. They also often contain a number of recommendations, warnings, and comments of various sorts. The organization of a procedural text is in general made visible by means of linguistic and typographic marks.

Research on procedural texts was initiated by works in psychology, cognitive ergonomics, and didactics: (Mortara et al. 1988), (Adam 1987), (Greimas 1983), (Kosseim 2000), to cite just a few. Several facets, such as temporal and argumentative structures, have since been subject to general-purpose investigations in linguistics, but they need to be customized to this type of text. There is, however, very little work done in Computational Linguistics circles. The present work is based on a preliminary experiment we carried out (Delpech et al., 07), (Aouladomar, 05), where a preliminary structure was proposed from corpus analysis.
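To make the idea of identifying components from linguistic and typographic marks more concrete, the following minimal sketch labels each line of a short text as a title, instruction, warning or prerequisite. The cue lists, the `Segment` class and the function names are hypothetical choices made for illustration only; they are not the criteria or the grammar used by the authors.

```python
import re
from dataclasses import dataclass

# Hypothetical surface cues: illustrative assumptions, not the authors' criteria.
WARNING_CUES = re.compile(
    r"^(warning|caution|never|do not|be careful)\b", re.IGNORECASE
)
PREREQUISITE_CUES = re.compile(
    r"^(you will need|ingredients|tools|materials|required)\b", re.IGNORECASE
)
# Typographic mark: a leading number ("1.", "2)") or bullet ("-", "*", "•").
LIST_MARK = re.compile(r"^\s*(?:\d+[.)]|[-*•])\s+")


@dataclass
class Segment:
    label: str  # "title", "instruction", "warning" or "prerequisite"
    text: str


def classify_line(line: str) -> Segment:
    """Label one line using surface cues only (a deliberately naive heuristic)."""
    text = LIST_MARK.sub("", line).strip()
    if WARNING_CUES.match(text):
        return Segment("warning", text)
    if PREREQUISITE_CUES.match(text):
        return Segment("prerequisite", text)
    # Assumption: an unmarked line with no final punctuation is a title (a goal).
    if not LIST_MARK.match(line) and not text.endswith((".", "!", ":", ";")):
        return Segment("title", text)
    return Segment("instruction", text)


def segment(text: str) -> list[Segment]:
    """Classify every non-empty line of a text."""
    return [classify_line(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    sample = (
        "How to boil an egg\n"
        "You will need: a pan and water.\n"
        "1. Fill the pan with water.\n"
        "2. Boil the egg for ten minutes.\n"
        "Warning: the water is hot."
    )
    for seg in segment(sample):
        print(f"{seg.label}: {seg.text}")
```

A line-by-line heuristic of this kind ignores the wide variety of realisations noted above, which is why a dedicated discourse analysis is needed in practice.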
In this paper, we summarize our results, focusing (1) on the conceptual notion of instructional compounds, which captures the complexity just described, (2) on the recognition of titles, instructions and instructional compounds and (3) on the modelling and implementation of a simple text grammar system that accounts for the overall text structure with respect to procedurality. We also report here on a quite comprehensive evaluation that was carried out. This work is part of the French ANR-RNTL TextCoop project.

2. The structure of procedural texts: Instructional Compounds

The main construction of procedural texts is the goal-plan structure. These texts may show a hierarchical structure composed of subgoals. This constitutes the skeleton of a procedural text. Procedural texts therefore contain two basic structures: titles, interpreted as goals (on which question matching procedures will apply), and instructions serving these goals. However, in most types of texts, we do not have just sequences of simple instructions but much more complex compounds composed of clusters of instructions. We noted that these compounds are organized around a few main instructions, to which a number of subordinate instructions, warnings, arguments, and explanations of various sorts may be adjoined. Procedural texts also


Similar resources

Some Foundational Linguistic Elements for QA Systems: an Application to E-government Services

The time savings and time flexibility of eGovernment procedures make them more attractive to citizens than face-to-face services. Citizens may interact with government via emails, search administrative information via eGovernment portals, or even via large-public search engines. Procedural question-answering systems are of much interest to query legislation, court decisions, guidelines, procedures, etc. In...

Some Challenges of Advanced Question-Answering: an Experiment with How-to Questions

This paper is a contribution to text semantics processing and its application to advanced question-answering where a significant portion of a well-formed text is required as a response. We focus on procedural texts of various domains, and show how titles, instructions, instructional compounds and arguments can be extracted.

A Semantic Analysis of Instructional Texts

Texts as well as dialogues have given rise to a number of analyses at discourse level from various perspectives, such as: modelling of nominal, temporal or spatial reference resolution [1, 5], of rhetorical structure [6], argumentative structure, and of cooperative discourse structure [3]. So far, little has been done to formally represent the structure of instructional (also called procedural) texts, ...

Investigating Embedded Question Reuse in Question Answering

The investigation presented in this paper is a novel method in question answering (QA) that enables a QA system to gain performance by reusing information in the answer to one question to answer another related question. Our analysis shows that a pair of questions in general open-domain QA can have an embedding relation through their mentions of noun phrase expressions. We present methods f...

Proceduralization and Transfer of Linguistics Knowledge as a Result of Form-focused Output and Input Practice

This study compared the effects of two types of form-focused tasks on proceduralization and transfer of linguistics knowledge in the case of English modals. All participants of the study attended pretests, posttests and delayed posttests. The procedural comprehension and production knowledge were measured through the groups’ performance on a timed dual task test that resembled the context of practi...

Publication date: 2008